Data automata on data words is a decidable model proposed by Bojańczyk et al. in 2006. Class
automata, introduced recently by Bojańczyk and Lasota, is an extension of data automata which
unifies different automata models on data words. The nonemptiness of class automata is undecidable,
since class automata can simulate two-counter machines. In this paper, a decidable model called
class automata with priority class condition, which restricts class automata but strictly extends data
automata, is proposed. The decidability of this model is obtained by establishing a correspondence
with priority multicounter automata. This correspondence also completes the picture of the links
between various class conditions of class automata and various models of counter machines. Moreover,
this model is applied to extend a decidability result of Alur, Černý and Weinstein on the algorithmic
analysis of array-accessing programs.



1 Introduction 

Driven by XML document processing and by the static analysis and verification of
programs, formalisms over infinite alphabets have become a research focus of theoretical computer
science (cf. [6] for a survey).

The infinite alphabet means $\Sigma \times D$, with $\Sigma$ a finite tag set and $D$ an infinite data domain. Words and
trees with the labels of nodes from the infinite alphabet $\Sigma \times D$ are called data words and data trees.
Formally, a data word is a pair $(w, \pi)$, with $w$ denoting the sequence of tags and $\pi$ denoting the corresponding
sequence of data values. Data trees can be defined similarly.

Among various models of logic and automata over infinite alphabets that have been proposed, data
automata were introduced by Bojańczyk et al. in 2006 to prove the decidability of two-variable logic on
data words ([4]).

A data automaton $\mathcal{D}$ consists of two parts, a nondeterministic letter-to-letter transducer $\mathcal{A} : \Sigma^* \to \Gamma^*$,
and a class condition which is a finite automaton $\mathcal{B}$ with the alphabet $\Gamma$. $\mathcal{D}$ accepts a data word $(w, \pi)$
iff from $w$, $\mathcal{A}$ is able to produce a $\Gamma$-string $w'$ such that,

for each class $X$ of $(w, \pi)$ (a class of a data word is a maximal set of positions with the same
data value), $\mathcal{B}$ has an accepting run over $w'|_X$ (the restriction of $w'$ to the positions in $X$).

Several extensions of data automata have appeared in the literature. 

Extended data automata were proposed by Alur, Černý and Weinstein in 2009, in order to analyze
array-accessing programs ([1]). Extended data automata extend data automata by the class condition,
which is now a finite automaton $\mathcal{B}$ with the alphabet $\Gamma \cup \{0\}$. $\mathcal{D}$ accepts a data word $(w, \pi)$ iff from $w$,
$\mathcal{A}$ is able to produce a $\Gamma$-string $w'$ such that,

for each class $X$ of $(w, \pi)$, $\mathcal{B}$ has an accepting run over $w' \oplus X$, where $w' \oplus X$ is the string
in $(\Gamma \cup \{0\})^*$ obtained from $w'$ by replacing each letter $w'_i$ such that $i \notin X$ by $0$ (note that
$w' \oplus X$ has the same length as $w'$).

However, as shown in [1], it turns out that extended data automata are expressively equivalent to data
automata, thus they are a syntactic extension, but not a semantic extension of data automata.

Another extension of data automata, class automata, was proposed by Bojańczyk and Lasota in 2010
to capture the full XPath, including forward and backward modalities and all types of data tests ([3]).

Class automata generalize both data automata and extended data automata by the class condition,
which is now a finite automaton $\mathcal{B}$ with the alphabet $\Gamma \times \{0, 1\}$. $\mathcal{D}$ accepts a data word $(w, \pi)$ iff from
$w$, $\mathcal{A}$ is able to produce a $\Gamma$-string $w'$ such that,

for each class $X$ of $(w, \pi)$, $\mathcal{B}$ has an accepting run over $w' \otimes X$, where $w' \otimes X$ is the string in
$(\Gamma \times \{0, 1\})^*$ obtained from $w'$ by replacing each letter $w'_i$ by $(w'_i, 1)$ if $i \in X$, and by $(w'_i, 0)$
otherwise.

In [3], Bojańczyk and Lasota also defined various class conditions of class automata and established
their correspondences with different models of counter machines, including multicounter machines with
or without zero tests, counter machines with increasing errors, and Presburger automata.

Besides the models of counter machines considered in [3], there is still another type of counter
machines, called priority multicounter automata, proposed by Reinhardt in his Habilitation thesis ([5]),
where he showed that the nonemptiness of priority multicounter automata is decidable. Priority
multicounter automata were also used by Björklund and Bojańczyk to prove the decidability of two-variable
first order logic over data trees of bounded depth ([2]).

A priority multicounter automaton (PMA) is a multicounter automaton $M$ with restricted zero
tests: The $n$ counters in $M$ are ordered as $C_1, \ldots, C_n$. $M$ can select an index $i \le n$, and test whether for
each $j \le i$, $C_j = 0$.

In this paper, we propose a new type of class condition for class automata, called priority class
condition, and show its correspondence with priority multicounter automata, thus showing the decidability
as well as completing the picture of the links between class automata and counter machines established
by Bojańczyk and Lasota.

The main idea of the priority class condition of class automata is roughly as follows: 

Let $\mathcal{D} = (\mathcal{A}, \mathcal{B})$ be a class automaton such that the output alphabet of the transducer $\mathcal{A}$ is
$\Gamma$. Then a priority class condition is obtained by putting an order (priority) over the letters
$\gamma \in \Gamma$ and using this order to restrict the $(\gamma, 0)$-transitions of $\mathcal{B}$.

In this sense, a data automaton is a class automaton with priority class condition (PCA) in which all
the $(\gamma, 0)$-transitions are self-loops, while an extended data automaton is a PCA in which the different
$\gamma$'s are non-distinguishable in $(\gamma, 0)$-transitions.

With respect to the closure properties, we show that PCAs are closed under letter projection and
union, but not under intersection nor complementation. While data automata (and the expressively
equivalent extended data automata) are closed under letter projection, union and intersection, it turns out that
PCAs strictly extend data automata and still preserve the decidability.

In addition, we demonstrate the usefulness of PCAs by applying them to generalize a decidability
result of Alur, Černý and Weinstein on the analysis of array-accessing programs ([1]).

This paper is organized as follows. In Section 2, some preliminaries are given. Then in Section 3,
the concepts of 0-priority finite automata and 0-priority regular languages are introduced and PCA is
defined. In Section 4, the correspondence between PCA and PMA is established. Section 5 discusses the
application of PCAs to the algorithmic analysis of array-accessing programs. All the missing proofs can
be found in the full version of this paper ([7]).

2 Preliminaries 

In this paper, we fix a finite tag set $\Sigma$ and an infinite data domain $D$, e.g. the set of natural numbers $\mathbb{N}$.

A word $w$ over $\Sigma$ is a function from $[n] = \{1, \ldots, n\}$ to $\Sigma$ for some $n \ge 1$. Suppose $w : [n] \to \Sigma$ is a
word, then $|w|$ is used to denote the length of $w$, namely $n$. If in addition $X \subseteq [n]$, then $w|_X$ is used to
denote the subword of $w$ restricted to the set of positions in $X$. A language is a set of words.

A data word is a pair $(w, \pi)$, where $w$ is a word in $\Sigma^*$ of length $n$ and $\pi : [n] \to D$. A class of a data
word $(w, \pi)$ (of length $n$) corresponding to a data value $d \in D$ is the collection of all the positions $i \in [n]$
such that $\pi(i) = d$. For instance, the class of the data word $(a,0)(b,1)(c,0)$ corresponding to the data
value $0$ is $\{1, 3\}$. A data language is a set of data words. Let $L$ be a data language, the language of words
corresponding to $L$, denoted by $str(L)$, is $\{w \mid (w, \pi) \in L\}$.
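
As an illustration of these definitions, the following minimal Python sketch computes the classes of a data word and the restriction $w|_X$. The representation of a data word as a list of (tag, data value) pairs and the function names `classes` and `restrict` are assumptions made for this example only.

```python
from collections import defaultdict

def classes(data_word):
    """Map each data value to the list of 1-based positions carrying it."""
    result = defaultdict(list)
    for position, (_tag, value) in enumerate(data_word, start=1):
        result[value].append(position)
    return dict(result)

def restrict(word, positions):
    """Return w|_X: the letters of `word` at the 1-based positions in X."""
    return [word[i - 1] for i in sorted(positions)]

# The data word (a,0)(b,1)(c,0) from the text.
data_word = [("a", 0), ("b", 1), ("c", 0)]
tags = [tag for tag, _value in data_word]

print(classes(data_word))      # {0: [1, 3], 1: [2]}
print(restrict(tags, {1, 3}))  # ['a', 'c']
```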

A data automaton $\mathcal{D}$ consists of two parts,

• a nondeterministic letter-to-letter transducer $\mathcal{A} : \Sigma^* \to \Gamma^*$,

• and a class condition, which is a finite automaton $\mathcal{B}$ over the alphabet $\Gamma$.

A data automaton $\mathcal{D} = (\mathcal{A}, \mathcal{B})$ accepts a data word $(w, \pi)$ iff from $w$, $\mathcal{A}$ is able to produce a string
$w' \in \Gamma^*$ (with the same length as $w$) such that for each class $X$ of $(w, \pi)$, $\mathcal{B}$ has an accepting run over
$w'|_X$. The set of data words accepted by $\mathcal{D}$ is denoted by $\mathcal{L}(\mathcal{D})$.

Class automata $\mathcal{D} = (\mathcal{A}, \mathcal{B})$ are an extension of data automata with the class condition changed into
a finite automaton over the alphabet $\Gamma \times \{0, 1\}$.

A class automaton $\mathcal{D} = (\mathcal{A}, \mathcal{B})$ accepts a data word $(w, \pi)$ iff from $w$, $\mathcal{A}$ is able to produce a $\Gamma$-string
$w'$ such that for each class $X$ of $(w, \pi)$, $\mathcal{B}$ has an accepting run over $w' \otimes X$, where $w' \otimes X \in (\Gamma \times \{0, 1\})^*$ is
obtained from $w'$ by replacing each letter $w'_i$ by $(w'_i, 1)$ if $i \in X$, and by $(w'_i, 0)$ otherwise, e.g. if $w' = abc$
and $X = \{1, 3\}$, then $w' \otimes X = (a,1)(b,0)(c,1)$. The set of data words accepted by $\mathcal{D}$ is denoted by
$\mathcal{L}(\mathcal{D})$.
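
The marking operation $w' \otimes X$ can be sketched in a few lines of Python. The function name `mark` and the encoding of marked letters as (letter, bit) pairs are assumptions of this sketch, not notation from the paper.

```python
def mark(word, positions):
    """Return w' (x) X: pair each letter with 1 if its 1-based position is in X, else 0."""
    return [(letter, 1 if i in positions else 0)
            for i, letter in enumerate(word, start=1)]

# The example from the text: w' = abc and X = {1, 3}.
print(mark("abc", {1, 3}))  # [('a', 1), ('b', 0), ('c', 1)]
```

Note that, unlike the restriction $w'|_X$ used by data automata, the marked string keeps the length of $w'$, so the class condition can also observe the letters at positions outside the class.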

A multicounter automaton is a 6-tuple $(Q, \Sigma, k, \delta, q_0, F)$ such that

• $Q$ is a finite set of states,

• $\Sigma$ is the finite alphabet,

• $k$ is the number of counters,

• $\delta \subseteq Q \times (\Sigma \cup \{\varepsilon\}) \times L \times Q$ is the set of transition relations over the instruction set
$L = \{inc_i, dec_i, ifz_i \mid 1 \le i \le k\}$,

• $q_0$ is the initial state,

• $F$ is the set of accepting states.

Let $\mathcal{C} = (Q, \Sigma, k, \delta, q_0, F)$ be a multicounter automaton. A configuration of $\mathcal{C}$ is a state together
with a list of counter values, namely, an element from $Q \times \mathbb{N}^k$. A configuration $(q', c')$ is said to be
an immediate successor of $(q, c)$ induced by a letter $\sigma \in \Sigma \cup \{\varepsilon\}$ and an instruction $l \in L$, denoted as

$(q, c) \xrightarrow{\sigma, l} (q', c')$, if $(q, \sigma, l, q') \in \delta$ and

• if $l = inc_i$, then $c'_i = c_i + 1$ and $c'_j = c_j$ for $j \ne i$,

• if $l = dec_i$, then $c_i > 0$, $c'_i = c_i - 1$, and $c'_j = c_j$ for $j \ne i$,

• if $l = ifz_i$, then $c_i = 0$ and $c'_j = c_j$ for each $j : 1 \le j \le k$.

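A run of $\mathcal{C}$ over a word $w$ is a nonempty sequence $(q_0, \bar{0}) \xrightarrow{\sigma_1, l_1} (q_1, c_1) \ldots \xrightarrow{\sigma_n, l_n} (q_n, c_n)$ such
that $w = \sigma_1 \ldots \sigma_n$. A run is accepting if $q_n \in F$. $\mathcal{C}$ accepts a word $w$ if there is an accepting run of $\mathcal{C}$
over $w$.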
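
The effect of a single instruction on the counter values can be sketched as follows. This is a minimal illustration, assuming counters are stored as a tuple indexed from 1 in the text and from 0 in the code; the function name `step` and the string encoding of instructions are hypothetical.

```python
def step(counters, instruction):
    """Apply one instruction to a tuple of counter values.

    `instruction` is ("inc", i), ("dec", i) or ("ifz", i) with a 1-based index i.
    Returns the new tuple, or None if the instruction is not enabled.
    """
    op, i = instruction
    values = list(counters)
    if op == "inc":
        values[i - 1] += 1
    elif op == "dec":
        if values[i - 1] == 0:      # dec_i requires c_i > 0
            return None
        values[i - 1] -= 1
    elif op == "ifz":
        if values[i - 1] != 0:      # ifz_i requires c_i = 0
            return None
    else:
        raise ValueError(f"unknown instruction {op!r}")
    return tuple(values)

print(step((0, 0), ("inc", 1)))  # (1, 0)
print(step((0, 0), ("dec", 1)))  # None
print(step((1, 0), ("ifz", 2)))  # (1, 0)
```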

A priority multicounter automaton (abbreviated as PMA) is a multicounter automaton $\mathcal{C}$ with the
following restricted zero tests:

The $k$ counters in $\mathcal{C}$ are ordered as $C_1, \ldots, C_k$. $\mathcal{C}$ can select some index $i \le k$, and test
whether for each $j \le i$, the counter $C_j$ has value 0.

Namely, a priority multicounter automaton is the same as a multicounter automaton, except that the
instruction set $L$ is changed into $\{inc_i, dec_i, ifz_{\le i} \mid 1 \le i \le k\}$.
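
The restricted zero test can be sketched as a predicate over a prefix of the ordered counters. The function name `prefix_zero_test` is an assumption of this sketch.

```python
def prefix_zero_test(counters, i):
    """Restricted zero test: True iff the first i counters (1-based) are all zero."""
    return all(value == 0 for value in counters[:i])

print(prefix_zero_test((0, 0, 3), 2))  # True
print(prefix_zero_test((0, 1, 0), 3))  # False
```

In the second call the third counter is zero, yet the test fails because the second counter is not: a PMA has no instruction that inspects a later counter in isolation from the earlier ones.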

Theorem 1 ([5]). The nonemptiness of priority multicounter automata is decidable.



3 Class automata with priority class condition 

Intuitively, class automata with priority class condition are obtained from class automata by restricting 
the class condition to 0-priority regular languages defined in the following. 

We first introduce several notations. 

Let $\mathcal{B} = (Q, \Gamma \times \{0, 1\}, \delta, q_0, F)$ be a deterministic complete finite automaton over the alphabet
$\Gamma \times \{0, 1\}$. We use the notation $q \xrightarrow{(\gamma, b)} q'$ to denote the fact that $\delta(q, (\gamma, b)) = q'$, where $b = 0, 1$, and
$q \longrightarrow q'$ to denote the fact that $q'$ is reachable from $q$ in the transition graph of $\mathcal{B}$. The transitions
$q \xrightarrow{(\gamma, 1)} q'$ (resp. $q \xrightarrow{(\gamma, 0)} q'$) are called the one-transitions (resp. zero-transitions) of $\mathcal{B}$.

Let $G_0$ be the directed subgraph of the transition graph $(Q, \delta)$ obtained from $(Q, \delta)$ by restricting the
set of arcs to those labeled by letters from $\Gamma \times \{0\}$. Formally, $G_0 = (Q, \delta \cap (Q \times (\Gamma \times \{0\}) \times Q))$. We
use the notation $q \xrightarrow{0} q'$ to denote the fact that $q'$ is reachable from $q$ in $G_0$.

A state $q \in Q$ is called 0-cyclic if $q$ belongs to some nontrivial (containing at least one arc)
strongly-connected component (SCC) $C$ of $G_0$. Otherwise $q$ is called 0-acyclic.

For each $\gamma \in \Gamma$, let $G_{(\gamma,0)}$ be the directed subgraph of $(Q, \delta)$ obtained from $(Q, \delta)$ by restricting the
set of arcs to those labeled by $(\gamma, 0)$. Formally, $G_{(\gamma,0)} = (Q, \delta \cap (Q \times \{(\gamma, 0)\} \times Q))$. The out-degree of
each vertex in $G_{(\gamma,0)}$ is exactly one, thus it has a simple structure: Each connected component of $G_{(\gamma,0)}$
consists of a unique cycle and a set of directed paths towards that cycle.

Let $\gamma \in \Gamma$. The cycles in $G_{(\gamma,0)}$ are called the $(\gamma, 0)$-cycles of $\mathcal{B}$. If a state $q$ belongs to some
$(\gamma, 0)$-cycle in $G_{(\gamma,0)}$, then $q$ is called a $(\gamma, 0)$-cyclic state, otherwise, it is called a $(\gamma, 0)$-acyclic state of $\mathcal{B}$. Note
that $(\gamma, 0)$-acyclic states may be 0-cyclic.

Example 2. An example of the deterministic complete automaton $\mathcal{B}$ over the alphabet $\{a, b\} \times \{0, 1\}$ is
given in Figure 1(a). Its associated $G_0$ and $G_{(b,0)}$ are given in Figure 1(b) and Figure 1(c) respectively.
The states $q_0$ and $q_2$ are both 0-cyclic and $(b, 0)$-cyclic, while $q_1$ is 0-cyclic but $(b, 0)$-acyclic, since $q_1$
belongs to a cycle in $G_0$ and does not belong to any cycle in $G_{(b,0)}$.

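Definition 3 ($((\gamma_1, 0), (\gamma_2, 0))$-pattern). Let $\gamma_1, \gamma_2 \in \Gamma$. A $((\gamma_1, 0), (\gamma_2, 0))$-pattern in $\mathcal{B}$ is a state-tuple
$(q_1, q_2, q_3, q_4)$ such that $q_1 \xrightarrow{(\gamma_1, 0)} q_2 \xrightarrow{0} q_3 \xrightarrow{(\gamma_2, 0)} q_4$ (the middle arrow denoting reachability in $G_0$), $q_1$ is 0-cyclic, and $q_3$ is $(\gamma_2, 0)$-acyclic.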

[Figure 1: (a) the automaton $\mathcal{B}$, (b) $G_0$, (c) $G_{(b,0)}$]

Example 4. For the automaton $\mathcal{B}$ in Figure 1(a), because $q_1 \xrightarrow{(a,0)} q_1 \xrightarrow{(b,0)} q_0$, and $q_1$ is 0-cyclic and
$(b, 0)$-acyclic, it follows that $(q_1, q_1, q_1, q_0)$ is a $((a, 0), (b, 0))$-pattern in $\mathcal{B}$.



Definition 5 (0-priority finite automata and 0-priority regular languages). Let $\mathcal{B}$ be a finite automaton
over the alphabet $\Gamma \times \{0, 1\}$. Then $\mathcal{B}$ is called a 0-priority finite automaton if $\mathcal{B}$ is a deterministic
complete automaton such that

the letters in $\Gamma$ can be ordered as a sequence $\gamma_1 \gamma_2 \ldots \gamma_k$ satisfying that there are no
$((\gamma_i, 0), (\gamma_j, 0))$-patterns with $i \ge j$ in $\mathcal{B}$.

A regular language $L \subseteq (\Gamma \times \{0, 1\})^*$ is called a 0-priority regular language if there is a 0-priority
finite automaton over the alphabet $\Gamma \times \{0, 1\}$ accepting $L$.

Now we state several properties of 0-priority finite automata and 0-priority regular languages. 

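Proposition 6. Let $\mathcal{B} = (Q, \Gamma \times \{0, 1\}, \delta, q_0, F)$ be a deterministic complete finite automaton. Then $\mathcal{B}$
is a 0-priority finite automaton iff $\mathcal{B}$ satisfies the following two conditions,

1. for any $\gamma \in \Gamma$, there are no $((\gamma, 0), (\gamma, 0))$-patterns in $\mathcal{B}$;

2. for any $\gamma_1, \gamma_2 \in \Gamma$ such that $\gamma_1 \ne \gamma_2$, if there is a $((\gamma_1, 0), (\gamma_2, 0))$-pattern in $\mathcal{B}$, then there do not
exist $((\gamma_2, 0), (\gamma_1, 0))$-patterns in $\mathcal{B}$.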

Corollary 7. Given a deterministic complete automaton $\mathcal{B}$ over the alphabet $\Gamma \times \{0, 1\}$, it is decidable
in polynomial time whether $\mathcal{B}$ is a 0-priority finite automaton.
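
To make the notions of this section concrete, the following Python sketch checks 0-cyclicity, $(\gamma, 0)$-cyclicity and the existence of patterns, and then searches for an ordering as required by Definition 5. The transition table `DELTA` is a hypothetical three-state automaton chosen to be consistent with the statements of Examples 2 and 4; it is an assumption of this sketch, not a transcription of Figure 1. The ordering search works directly from Definition 5 (a letter may be placed next only if no remaining letter has a pattern into it) rather than from Proposition 6.

```python
STATES = ["q0", "q1", "q2"]
LETTERS = ["a", "b"]

# Hypothetical automaton (assumption): q0 initial, q2 a sink.
DELTA = {
    ("q0", ("a", 0)): "q0", ("q0", ("a", 1)): "q1",
    ("q0", ("b", 0)): "q0", ("q0", ("b", 1)): "q0",
    ("q1", ("a", 0)): "q1", ("q1", ("a", 1)): "q2",
    ("q1", ("b", 0)): "q0", ("q1", ("b", 1)): "q1",
    ("q2", ("a", 0)): "q2", ("q2", ("a", 1)): "q2",
    ("q2", ("b", 0)): "q2", ("q2", ("b", 1)): "q2",
}

def zero_reachable(q):
    """States reachable from q in G_0 (including q itself)."""
    seen, stack = {q}, [q]
    while stack:
        p = stack.pop()
        for g in LETTERS:
            r = DELTA[(p, (g, 0))]
            if r not in seen:
                seen.add(r)
                stack.append(r)
    return seen

def is_zero_cyclic(q):
    """q lies in a nontrivial SCC of G_0: some zero-arc out of q leads back to q."""
    return any(q in zero_reachable(DELTA[(q, (g, 0))]) for g in LETTERS)

def is_letter_cyclic(q, g):
    """q lies on a (g,0)-cycle: iterating the (g,0)-successor returns to q."""
    p = DELTA[(q, (g, 0))]
    for _ in STATES:
        if p == q:
            return True
        p = DELTA[(p, (g, 0))]
    return False

def has_pattern(g1, g2):
    """Is there a ((g1,0),(g2,0))-pattern (q1, q2, q3, q4)?"""
    for q1 in STATES:
        if not is_zero_cyclic(q1):
            continue
        q2 = DELTA[(q1, (g1, 0))]
        # q4 always exists because the automaton is complete.
        if any(not is_letter_cyclic(q3, g2) for q3 in zero_reachable(q2)):
            return True
    return False

def priority_ordering():
    """Return an ordering with no ((g_i,0),(g_j,0))-pattern for i >= j, or None."""
    remaining, order = list(LETTERS), []
    while remaining:
        free = [g for g in remaining
                if not any(has_pattern(h, g) for h in remaining)]
        if not free:
            return None
        order.append(free[0])
        remaining.remove(free[0])
    return order

print(has_pattern("a", "b"))   # True
print(priority_ordering())     # ['a', 'b']
```

Each helper runs in time polynomial in the number of states and letters, which is in the spirit of Corollary 7, although no attempt is made here to match the bound obtained in the paper.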

For each nontrivial strongly-connected component (SCC) $C$ of $G_0$, let $L_C$ denote the set of labels
$(\gamma, 0)$ of the arcs belonging to $C$.

Proposition 8. If $\mathcal{B}$ is a 0-priority finite automaton, then $G_0$ enjoys the following two properties.

1. Suppose that $q_1 \xrightarrow{(\gamma, 0)} q_2$ such that $q_1$ is 0-cyclic, then $q_2$ is $(\gamma, 0)$-cyclic.

2. For each nontrivial SCC $C$ of $G_0$ and each $(\gamma, 0) \in L_C$, every state in $C$ is $(\gamma, 0)$-cyclic.

From Proposition 8, the following property can be easily deduced.

Corollary 9. Let $\mathcal{B}$ be a 0-priority finite automaton. If a state $q$ is reachable from some 0-cyclic state in
$\mathcal{B}$, then $q$ is 0-cyclic as well.

In other words, the above corollary says that 0-acyclic states cannot be reached from 0-cyclic states
in a 0-priority finite automaton.

Proposition 10. Let $L \subseteq (\Gamma \times \{0, 1\})^*$ be a regular language. Then $L$ is a 0-priority regular language iff
the unique minimal deterministic complete finite automaton $\mathcal{B}$ accepting $L$ is a 0-priority finite
automaton.

Definition 11 (Class automata with priority class condition, PCA). A class automaton $(\mathcal{A}, \mathcal{B})$ is said
to have priority class condition, if the alphabet $\Gamma$ can be partitioned into $k$ ($k \ge 1$) disjoint subsets
$\Gamma_1, \ldots, \Gamma_k$ such that $\mathcal{L}(\mathcal{B})$ is a union of languages $L_1, \ldots, L_k$ satisfying that $L_i \subseteq (\Gamma_i \times \{0, 1\})^*$ is a
0-priority regular language for each $i : 1 \le i \le k$.

Intuitively, a class automaton $\mathcal{D} = (\mathcal{A}, \mathcal{B})$ with priority class condition is a class automaton such
that

over a data word $(w, \pi)$, $\mathcal{A}$ nondeterministically chooses an index $i : 1 \le i \le k$, then produces
a word $w' \in \Gamma_i^*$, and verifies that each class string $w' \otimes X$ belongs to the 0-priority regular
language $L_i$.

Remark 12. In the definition of PCAs, $\mathcal{L}(\mathcal{B})$ is defined as a disjoint union of 0-priority regular
languages, instead of a single 0-priority regular language. PCAs defined in this way can be shown to be
closed under union (cf. Proposition 15), while preserving the decidability (Theorem 18).

Example 13. Let $\mathcal{D}$ be the class automaton $(\mathcal{A}, \mathcal{B})$ such that $\mathcal{A}$ is the identity transducer and $\mathcal{B}$ is the
automaton over the alphabet $\{a, b\} \times \{0, 1\}$ in Figure 1(a). Then $\mathcal{D}$ accepts the data words satisfying
the property "between any two occurrences of the letter a with the same data value, there is a letter b
with a different data value". If $\{a, b\}$ is ordered as $ab$, then there are no $((a, 0), (a, 0))$-patterns, nor
$((b, 0), (a, 0))$-patterns, nor $((b, 0), (b, 0))$-patterns, in $\mathcal{B}$. Thus $\mathcal{B}$ is a 0-priority finite automaton under
the ordering $ab$, so $\mathcal{D}$ is a PCA.
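
The property of Example 13 can be tested on concrete data words with a short Python sketch. The class condition below is a hypothetical three-state automaton written to recognize the stated property (state `q1` means "an a of this class is pending", `q2` is a rejecting sink); it is an assumption of this sketch and is not transcribed from Figure 1. The transducer is the identity, so the class condition reads the tags directly.

```python
def delta(state, letter, bit):
    """Transition function of the hypothetical class condition."""
    if state == "q0":
        return "q1" if (letter, bit) == ("a", 1) else "q0"
    if state == "q1":
        if (letter, bit) == ("a", 1):
            return "q2"          # second a of the class with no b outside the class in between
        if (letter, bit) == ("b", 0):
            return "q0"          # a b with a different data value clears the pending a
        return "q1"
    return "q2"                  # rejecting sink

def accepts(data_word):
    """Run one copy of the class condition per data value; all copies must accept."""
    for d in {value for _tag, value in data_word}:
        state = "q0"
        for tag, value in data_word:
            state = delta(state, tag, 1 if value == d else 0)
        if state == "q2":
            return False
    return True

print(accepts([("a", 1), ("b", 2), ("a", 1)]))  # True
print(accepts([("a", 1), ("b", 1), ("a", 1)]))  # False
```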

Remark 14. Data automata can be seen as PCAs by adding self-loops $q \xrightarrow{(\gamma, 0)} q$. Moreover, the extended
data automata introduced in [1] can also be seen as a special case of PCA. In extended data automata,
the class condition is a finite automaton $\mathcal{B}$ over the alphabet $\Gamma \cup \{0\}$, where the letters in $\Gamma$ are omitted
in zero-transitions. Without loss of generality, $\mathcal{B}$ can be assumed to be deterministic and complete, then
a deterministic complete finite automaton $\mathcal{B}'$ over the alphabet $\Gamma \times \{0, 1\}$ can be defined as follows:
$q \xrightarrow{(\gamma, 1)} q'$ in $\mathcal{B}'$ iff $q \xrightarrow{\gamma} q'$ in $\mathcal{B}$, and $q \xrightarrow{(\gamma, 0)} q'$ in $\mathcal{B}'$ iff $q \xrightarrow{0} q'$ in $\mathcal{B}$. In the subgraph $G_0$ of $\mathcal{B}'$, different
letters $(\gamma, 0)$ are non-distinguishable, so $G_0$ has the same structure as $G_{(\gamma,0)}$ for any $\gamma \in \Gamma$. Therefore,
$\mathcal{B}'$ is a 0-priority finite automaton under any ordering of letters in $\Gamma$, and extended data automata can
also be seen as PCAs.

Proposition 15. The class of data languages accepted by PCAs is closed under letter projection and
union, but not under intersection nor complementation.

The fact that PCAs are not closed under intersection is proved by contradiction: If PCAs were closed
under intersection, then PCAs would be able to simulate two-counter machines and thus become undecidable,
contradicting Corollary 19 in the next section.

Since data automata are closed under both union and intersection, it can be deduced that PCAs are 
strictly more expressive than data automata. 

Corollary 16. Class automata with priority class condition are strictly more expressive than data
automata.

Remark 17. From Corollary 16, we know that there is a data language recognized by PCAs, but not by
data automata. It would be nice if we could prove for instance that the data language in Example 13,
namely, "Between any two occurrences of the letter a of the same data value, there is an occurrence of
the letter b with a different data value", cannot be recognized by data automata. This is stated as an
open problem in this paper.

4 Correspondence between PCA and PMA

The aim of this section is to show that a correspondence between PCAs and PMAs can be established so 
that the decidability of the nonemptiness of PCAs follows from that of PMAs. 

Let $prj : \Sigma \to \Sigma' \cup \{\varepsilon\}$, then the projection of a data word $(w, \pi)$ under $prj$, denoted by $prj((w, \pi))$,
is $prj(w_1) \ldots prj(w_{|w|})$, and the projection of a data language $L$, denoted by $prj(L)$, is $\{prj((w, \pi)) \mid
(w, \pi) \in L\}$. Note that the projection of a data language is a language, not a data language.
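
A projection can be sketched as follows, assuming (for this example only) that $prj$ is given as a dictionary from tags to letters and that $\varepsilon$ is encoded as the empty string. The function name `project` is hypothetical.

```python
def project(data_word, prj):
    """Apply prj to every tag, drop the data values, and erase epsilon images."""
    return "".join(prj[tag] for tag, _value in data_word)

prj = {"a": "x", "b": "", "c": "y"}   # b is mapped to epsilon
print(project([("a", 0), ("b", 1), ("c", 0)], prj))  # xy
```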

Theorem 18. The following two language classes are equivalent: 

• projections of data languages accepted by PCAs, 

• languages accepted by PMAs. 

Corollary 19. The nonemptiness of PCAs is decidable.

We prove Theorem 18 by showing the following two lemmas.

Lemma 20. For a PCA $\mathcal{D}$, a PMA $\mathcal{C}$ can be constructed such that $\mathcal{L}(\mathcal{C}) = str(\mathcal{L}(\mathcal{D}))$.

From Lemma 20 it follows that the first language class in Theorem 18 is included in the second one,
since the class of languages accepted by PMAs is closed under mappings $prj : \Sigma_1 \to \Sigma_2 \cup \{\varepsilon\}$. The next
lemma says that the second language class in Theorem 18 is included in the first one.

Lemma 21. For a given PMA $\mathcal{C}$, a PCA $\mathcal{D}$ can be constructed such that $\mathcal{L}(\mathcal{C})$ is a projection of $\mathcal{L}(\mathcal{D})$.

The rest of this section is devoted to the proof of Lemma 20. The proof of Lemma 21 is omitted
and can be found in the full version of this paper ([7]).

The idea of the proof is to consider the abstract runs of class automata, simulate them by multicounter 
automata, and illustrate that the simulation can be fulfilled by a priority multicounter automaton if the 
priority class condition is assumed. The proof is inspired by the proof of Theorem 2 in 



4.1 From class automata to multicounter automata 

Let $\mathcal{D} = (\mathcal{A}, \mathcal{B})$ be a class automaton, where $\mathcal{A} = (Q_g, \Sigma, \Gamma, \delta_g, q^g_0, F_g)$ and $\mathcal{B} = (Q_c, \Gamma \times \{0, 1\}, \delta_c, q^c_0, F_c)$.
Without loss of generality, we assume that $\mathcal{B}$ is deterministic and complete.

Given a data word $(w, \pi)$, let $\mathcal{V}((w, \pi))$ be the set of data values occurring in $(w, \pi)$, namely, $\mathcal{V}((w, \pi)) =
\{\pi_i \mid 1 \le i \le |w|\}$, and $(w, \pi)_{\le i}$ be the restriction of $(w, \pi)$ to the set of positions $\{1, \ldots, i\}$ for each $i \le |w|$.

Intuitively, a run of $\mathcal{D}$ over a data word $(w, \pi)$ is a parallel running of the transducer $\mathcal{A}$ and the copies
of the automaton $\mathcal{B}$ over $(w, \pi)$, with one copy for each data value occurring in $(w, \pi)$. A run of $\mathcal{D}$ over a
data word $(w, \pi)$ can be seen as a sequence $(q^g_1, q^c_1, \gamma_1, R_1)(q^g_2, q^c_2, \gamma_2, R_2) \ldots (q^g_{|w|}, q^c_{|w|}, \gamma_{|w|}, R_{|w|})$ such that

• the sequence $(q^g_1, \gamma_1) \ldots (q^g_{|w|}, \gamma_{|w|})$ corresponds to a run of the transducer $\mathcal{A}$,

• $q^c_i$ records the state of a copy of $\mathcal{B}$ corresponding to a data value that has not been met until the
position $i$, namely, a data value not occurring in $(w, \pi)_{\le i}$,

• each time a new data value $\pi_i$ is met, $R_i(\pi_i)$ is set as $\delta_c(q^c_{i-1}, (\gamma_i, 1))$, since $\pi_i$ has not been met
before and $q^c_{i-1}$ records the current state of $\mathcal{B}$ for the new data values.

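Formally, a run of $\mathcal{D}$ over a data word $(w, \pi)$ is a sequence $(q^g_1, q^c_1, \gamma_1, R_1) \ldots (q^g_{|w|}, q^c_{|w|}, \gamma_{|w|}, R_{|w|})$
satisfying the following conditions,

• for each $i : 1 \le i \le |w|$, $(q^g_{i-1}, w_i, \gamma_i, q^g_i) \in \delta_g$, $\delta_c(q^c_{i-1}, (\gamma_i, 0)) = q^c_i$ (where $q^g_0, q^c_0$ are the initial states
of $\mathcal{A}$, $\mathcal{B}$ respectively),

• for each $i$, $R_i$ is a function from the set of data values occurring in $(w, \pi)_{\le i}$ to $Q_c$, satisfying the following conditions,

- for each $i : 1 \le i \le |w|$,
$R_i(\pi_i) = \delta_c(R_{i-1}(\pi_i), (\gamma_i, 1))$ if $\pi_i$ occurs in $(w, \pi)_{\le i-1}$, otherwise $R_i(\pi_i) = \delta_c(q^c_{i-1}, (\gamma_i, 1))$.
For each data value $d$ occurring in $(w, \pi)_{\le i-1}$ such that $d \ne \pi_i$, $R_i(d) = \delta_c(R_{i-1}(d), (\gamma_i, 0))$.

A run $(q^g_1, q^c_1, \gamma_1, R_1) \ldots (q^g_{|w|}, q^c_{|w|}, \gamma_{|w|}, R_{|w|})$ is successful if $q^g_{|w|} \in F_g$ and $R_{|w|}(d) \in F_c$ for each data value $d$
occurring in $(w, \pi)$.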

The functions $R_1, \ldots, R_{|w|}$ in a run of $\mathcal{D}$ on the data word $(w, \pi)$ can be abstracted into a sequence of
functions $C_1, \ldots, C_{|w|}$ such that each $C_i$ is a function $Q_c \to \mathbb{N}$ satisfying that for each $q \in Q_c$, $C_i(q)$ is the
number of data values $d$ occurring in $(w, \pi)_{\le i}$ such that $R_i(d) = q$.

Intuitively, each $C_i$ is a tuple of counter values, with one counter for each state in $Q_c$. The sequence
$C_1, \ldots, C_{|w|}$ can be seen in a more abstract way, without directly referring to the data values occurring in $(w, \pi)$,
as follows:

For each $1 \le i \le |w|$, $C_i$ is obtained from $C_{i-1}$ by nondeterministically choosing one of the
following two possibilities:

• either (corresponding to the situation that $\pi_i$ occurs in $(w, \pi)_{\le i-1}$)

- select some counter $q'$ with non-zero value (i.e. $C_{i-1}(q') > 0$), decrement the
counter $q'$,

- then for each counter $q''$, the value of $q''$ is assigned as the sum of those of counters
$p$ such that $\delta_c(p, (\gamma_i, 0)) = q''$,

- finally increment the counter $\delta_c(q', (\gamma_i, 1))$.

• or (corresponding to the situation that $\pi_i$ does not occur in $(w, \pi)_{\le i-1}$)

- for each counter $q''$, the value of $q''$ is assigned the sum of those of counters $p$
such that $\delta_c(p, (\gamma_i, 0)) = q''$,

- increment the counter $\delta_c(q^c_{i-1}, (\gamma_i, 1))$.

The sequence $(q^g_1, q^c_1, \gamma_1, C_1)(q^g_2, q^c_2, \gamma_2, C_2) \ldots (q^g_{|w|}, q^c_{|w|}, \gamma_{|w|}, C_{|w|})$ is said to be an abstract run of $\mathcal{D}$
over the data word $(w, \pi)$.
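
One step of an abstract run can be sketched in Python as below. The function name `abstract_step`, the dictionary encoding of $\delta_c$, and the two-state toy automaton used in the usage lines are all assumptions of this sketch; the nondeterministic choice between the two possibilities is modelled by the optional argument `old_state`.

```python
from collections import Counter

def abstract_step(counters, delta, gamma, fresh_state, old_state=None):
    """Compute C_i from C_{i-1} for an output letter gamma.

    old_state:   the state q' of the copy whose data value is read again,
                 or None when a new data value is met.
    fresh_state: q^c_{i-1}, the state shared by all data values not met so far.
    """
    current = Counter(counters)
    if old_state is not None:
        if current[old_state] <= 0:
            raise ValueError("the selected counter must be non-zero")
        current[old_state] -= 1
    # Every remaining copy takes its (gamma, 0)-transition.
    updated = Counter()
    for state, count in current.items():
        if count:
            updated[delta[(state, (gamma, 0))]] += count
    # The copy of the current data value takes its (gamma, 1)-transition.
    source = old_state if old_state is not None else fresh_state
    updated[delta[(source, (gamma, 1))]] += 1
    return updated

# Toy class condition over the single letter x (hypothetical).
toy = {("s", ("x", 0)): "s", ("s", ("x", 1)): "t",
       ("t", ("x", 0)): "t", ("t", ("x", 1)): "s"}

c1 = abstract_step(Counter(), toy, "x", fresh_state="s")            # new data value
c2 = abstract_step(c1, toy, "x", fresh_state="s")                   # another new value
c3 = abstract_step(c2, toy, "x", fresh_state="s", old_state="t")    # an old value again
print(dict(c1), dict(c2), dict(c3))
```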

With such an abstract view of runs, $\mathcal{D}$ can be transformed into a multicounter automaton (with zero
tests) $\mathcal{C} = (Q_a, \Sigma, k, \delta_a, q^a_0, F_a = \{q_{acc}\})$ as follows,

• $Q_a$ includes $Q_g \times Q_c$ and some auxiliary states, e.g. for controlling the updates of the counter
values.

• $\mathcal{C}$ consists of $k = |Q_c|$ counters, one counter for each state in $Q_c$.

• $q^a_0 = (q^g_0, q^c_0)$.

• Each $\gamma \in \Gamma$ induces a series of transition rules in $\delta_a$ as follows:
If

the current state of $\mathcal{C}$ is $(p^g, p^c)$, the read head is in a position labeled by $\sigma \in \Sigma$, and
there are $q^g \in Q_g$, $q^c \in Q_c$ such that $(p^g, \sigma, \gamma, q^g) \in \delta_g$ and $\delta_c(p^c, (\gamma, 0)) = q^c$,

then

the state of $\mathcal{C}$ is changed into $(q^g, q^c)$, the counter values are updated in such a way to
obtain $C_i$ from $C_{i-1}$ as above, and the read head is moved to the next position.

• Nondeterministically, $\mathcal{C}$ changes the state into a special state $q_g$ and repeats the following action:
$\mathcal{C}$ arbitrarily chooses a non-zero counter $q \in F_c$, decrements $q$. Then it tests whether all
the counters have zero value. If so, $\mathcal{C}$ changes the state into $q_{acc}$ and accepts.

We now specify in detail how to update the counter values in $\mathcal{C}$, essentially, how to perform the
following updates:

For each counter $q''$ in $\mathcal{C}$, the value of $q''$ is assigned the sum of those of the counters $p$ such
that $\delta_c(p, (\gamma, 0)) = q''$.

Recall that each connected component of $G_{(\gamma,0)}$ of $\mathcal{B}$ consists of a unique cycle $C$ and several paths
towards $C$. Let $C = q_1 \ldots q_r$, then for each $1 \le i \le r$, the value of the counter $q_{i+1}$ is assigned as the sum
of the value of the counter $q_i$ and the values of the counters of its predecessors not in $C$, where $q_{r+1} = q_1$
by convention. Then the counter values can be updated as follows,

1. the counters corresponding to the states in $C$ are first renamed. For each $i : 1 \le i \le r$, $q_i$ is renamed
as $q_{i+1}$, where $q_{r+1} = q_1$ by convention. The renaming is remembered by the finite-state control
of $\mathcal{C}$. With this renaming, the counter $q_{i+1}$ takes the value of the counter $q_i$ for each $i : 1 \le i \le r$.

2. then the values of the counters on the paths towards $C$ are updated in a backward way: For instance,
let $p_1 \xrightarrow{(\gamma, 0)} p_2 \xrightarrow{(\gamma, 0)} p_3$ such that $p_3 \in C$, $p_1, p_2 \notin C$, then the value of $p_2$ is first added into $p_3$, by
decrementing $p_2$ and incrementing $p_3$ until the value of $p_2$ becomes zero; afterwards, the value of
$p_1$ is added into $p_2$, and so on.
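
The two-step procedure can be compared against the intended sum semantics with a small Python sketch. Here `succ` is the $(\gamma, 0)$-successor function given as a dictionary, and the function names are assumptions of this sketch. The renaming of step 1 is modelled by shifting the values along the cycle, whereas in the construction above it is bookkeeping in the finite-state control and no counter value is actually moved.

```python
def direct_update(counters, succ):
    """Reference semantics: new[q''] = sum of counters[p] over all p with succ[p] == q''."""
    new = {q: 0 for q in succ}
    for p, value in counters.items():
        new[succ[p]] += value
    return new

def cyclic_states(succ):
    """States lying on a cycle of the functional graph succ."""
    on_cycle = set()
    for start in succ:
        seen, q = [], start
        while q not in seen:
            seen.append(q)
            q = succ[q]
        on_cycle.update(seen[seen.index(q):])
    return on_cycle

def in_place_update(counters, succ):
    """Renaming on the cycles, then backward transfers along the paths."""
    c = dict(counters)
    cycle = cyclic_states(succ)
    # Step 1: the counter of succ[q] takes the old value of q, for q on a cycle.
    c.update({succ[q]: counters[q] for q in cycle})

    def distance(q):
        d = 0
        while q not in cycle:
            q, d = succ[q], d + 1
        return d

    # Step 2: states closest to the cycle are emptied first.
    for p in sorted((q for q in succ if q not in cycle), key=distance):
        while c[p] > 0:          # zero test on the single counter p
            c[p] -= 1
            c[succ[p]] += 1
    return c

succ = {"q1": "q2", "q2": "q1", "p2": "q1", "p1": "p2"}
counters = {"q1": 3, "q2": 5, "p1": 2, "p2": 7}
print(in_place_update(counters, succ) == direct_update(counters, succ))  # True
```

The loop in step 2 is where a zero test on an individual counter is used, which is exactly what the next subsection replaces by a zero test on a prefix of counters.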

The above updates of counter values of $\mathcal{C}$ need (unrestricted) zero tests. In the following we will
show that if $\mathcal{D}$ is a PCA, then these updates can be done with the restricted zero tests of PMAs, namely,
testing zero for a prefix of counters as a whole, instead of a single counter.



4.2 From PCA to PMA 

We first assume that $\mathcal{D} = (\mathcal{A}, \mathcal{B})$ is a PCA such that $\mathcal{L}(\mathcal{B})$ is a 0-priority regular language, and $\mathcal{B}$ is a
0-priority finite automaton. Later we will consider the more general case that $\mathcal{L}(\mathcal{B})$ is a disjoint union
of 0-priority regular languages.

We first introduce some notations and prove a property of abstract runs of PCA. 

Suppose that $\Gamma$ is ordered as $\gamma_1 \ldots \gamma_l$ under which $\mathcal{B}$ is a 0-priority finite automaton.

Let $D_{scc}(G_0)$ be the strongly-connected-component directed graph of $G_0$ of $\mathcal{B}$, then $D_{scc}(G_0)$ is an
acyclic directed graph. Let $\#_{scc}(G_0)$ denote the maximal length (number of arcs) of paths in $D_{scc}(G_0)$.

Similar to Lemma 1 in we can obtain the following lemma. 

Lemma 22. Let $\mathcal{D} = (\mathcal{A}, \mathcal{B})$ be a PCA such that $\mathcal{B}$ is a 0-priority finite automaton. Then any
abstract run of $\mathcal{D}$ over a data word $(w, \pi)$, say $(q^g_1, q^c_1, \gamma_1, C_1) \ldots (q^g_{|w|}, q^c_{|w|}, \gamma_{|w|}, C_{|w|})$, enjoys the following
property:

For each $i : 1 \le i \le |w|$, the sum of $C_i(q')$'s such that $q'$ is 0-acyclic is bounded by $\#_{scc}(G_0)$.

By utilizing Lemma 22, we then demonstrate how the updates of the counter values of the
multicounter automaton $\mathcal{C}$ obtained from $\mathcal{D}$ in Section 4.1 can be done with the restricted zero tests in PMAs.
We introduce some additional notations.

For each $i : 1 \le i \le l$, let $Acyc_i$ denote the set of 0-cyclic states $q \in Q_c$ such that $q$ is $(\gamma_i, 0)$-acyclic.
In addition, let $Acyc_{l+1}$ denote the set of 0-cyclic states $q \notin \bigcup_{1 \le i \le l} Acyc_i$ by convention.



The idea of renaming is from 1 1 1 
Proposition 23. Let $\mathcal{D} = (\mathcal{A}, \mathcal{B})$ be a PCA such that $\mathcal{B}$ is a 0-priority finite automaton under the
ordering $\gamma_1 \ldots \gamma_l$. Then $Acyc_1, \ldots, Acyc_{l+1}$ satisfy the following two properties:

1. $Acyc_i \subseteq Acyc_{i+1}$ for each $i < l$.

2. For each $i : 1 \le i \le l$, if $q \in Acyc_i$ and $q \xrightarrow{(\gamma_i, 0)} q'$, then $q' \notin Acyc_i$ and $q' \in Acyc_j$ for some $j > i$. In
particular, if $q \in Acyc_l$ and $q \xrightarrow{(\gamma_l, 0)} q'$, then $q' \notin Acyc_l$ and $q' \in Acyc_{l+1}$.

We are ready to show that if $\mathcal{D}$ is a PCA, then $\mathcal{C}$ can be turned into a PMA $\mathcal{C}_p = (Q_p, \Sigma, k, \delta_p, q^p_0, F_p)$.

From Lemma 22, if $\mathcal{D}$ is a PCA, then in the multicounter automaton $\mathcal{C}$, the sum of the values of
the counters corresponding to the 0-acyclic states of $\mathcal{B}$ is always bounded. Thus in $\mathcal{C}_p$, the counters
corresponding to these 0-acyclic states become virtual, in the sense that the values of these counters are
stored in the finite state control of $\mathcal{C}_p$, and there are no real counters in $\mathcal{C}_p$ corresponding to the 0-acyclic
states of $\mathcal{B}$.

The state set of $\mathcal{C}_p$ consists of the states $(p^g, p^c, \mathcal{J}_{Acyc})$ and some auxiliary states for updating the
counter values, where $\mathcal{J}_{Acyc}$ is the information about the virtual counters corresponding to the 0-acyclic
states of $\mathcal{B}$. The counters of $\mathcal{C}_p$ correspond to the 0-cyclic states of $\mathcal{B}$, with one counter for each 0-cyclic
state.

The counters (corresponding to the 0-cyclic states of $\mathcal{B}$) of $\mathcal{C}_p$ are ordered according to the following
order of 0-cyclic states of $\mathcal{B}$,

$Acyc_1 \, (Acyc_2 \setminus Acyc_1) \ldots (Acyc_l \setminus Acyc_{l-1}) \, Acyc_{l+1}$,

where an arbitrary ordering is given to the states within $Acyc_1$, $Acyc_{l+1}$, and each of $Acyc_{i+1} \setminus Acyc_i$ for
$i : 1 \le i < l$.

Each $\gamma \in \Gamma$ induces a series of transition rules in $\delta_p$ specified in the following.

If the current state of $\mathcal{C}_p$ is $(p^g, p^c, \mathcal{J}_{Acyc})$, the read head is in some position labeled by $\sigma$, and there
are $q^g \in Q_g$, $q^c \in Q_c$ such that $(p^g, \sigma, \gamma, q^g) \in \delta_g$ and $\delta_c(p^c, (\gamma, 0)) = q^c$, then the state of $\mathcal{C}_p$ is changed
into $(q^g, q^c, \mathcal{J}'_{Acyc})$. Now we illustrate how the values of the real counters are updated and how the values
of the virtual counters, i.e. $\mathcal{J}_{Acyc}$ in the finite state control of $\mathcal{C}_p$, are updated into $\mathcal{J}'_{Acyc}$ in the following
three steps.

1. Either

the state $p^c_1 = \delta_c(p^c, (\gamma, 1))$ (a new data value is met) is stored in the finite state control of $\mathcal{C}_p$,

or

some (0-acyclic or 0-cyclic) state $q' \in Q_c$ (an old value is met) is selected, the (virtual
or real) counter corresponding to $q'$ is decremented, and the state $p^c_1 = \delta_c(q', (\gamma, 1))$
(the virtual or real counter corresponding to it should be incremented) is stored in the
finite-state control of $\mathcal{C}_p$.

2. The values of the (virtual or real) counters are updated as follows. 
Let 7=7 for some i: \ <i <l. 

The counters corresponding to the states in Acycj \Acycj-\ for j > i, which are (7,0)-cyclic in 
are first updated by renaming, with the renaming stored in the finite state control of "^p. Then for 
each counter q G Acyc\, the value of the counter q is added to its (7,0)-successor q' , which is in 

Acycj \Acyci for some j > / according to the fact that q G Acyc\ C Acyci, q q' and Proposition 



23 Namely, the value of the counter q is decremented and the value of q' is incremented until the 
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value of the counter q becomes zero. Afterwards, for each counter q € Acyc2 \Acyci, the value of 
the counter q is added to its (^•,0)-successor (which is also in Acycj\Acyci for some j > /), and 
so on, until all the counters corresponding to the states in Ac3'c, \Acjc,_i are updated. 
Note that during these updates of counter values, the zero-tests can be restricted to the zero-tests 
for a prefix of counters. The reason is that when updating the counter corresponding to a state 
q &Acycj+\ \Acycj for some j < i, the values of the counters corresponding to the states in 
Acyci,...,Acycj\Acycj-i are already zero. Therefore, testing zero for the counter q is equal 
to testing zero for the counters before q (including q) in the ordering. 

Then the information about the values of the virtual counters is updated by following the
zero-transitions of the class condition, and some real counters (corresponding to the 0-cyclic
states) should also be incremented if they correspond to the $(\gamma_i,0)$-successors of some 0-acyclic
states.

3. If $p'_1$ is 0-acyclic, then the information about the virtual counters is further updated by
incrementing the value of the virtual counter corresponding to $p'_1$; otherwise, the value of the
real counter corresponding to the (0-cyclic) state $p'_1$ is incremented.
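
The transfer of counter values in step 2 can be illustrated by a minimal Python sketch. It is only an illustration under simplifying assumptions: the class `PrefixZeroTestCounters` and the function `transfer` are hypothetical names, counters are identified with their positions in the ordering, and the only zero-test offered is the test that a prefix of the counters is zero, in the spirit of the restricted zero-tests of priority multicounter automata.

```python
class PrefixZeroTestCounters:
    """Ordered counters whose only zero-test is 'counters 0..k are all zero'."""

    def __init__(self, values):
        self.values = list(values)

    def inc(self, k):
        self.values[k] += 1

    def dec(self, k):
        if self.values[k] == 0:
            raise ValueError("cannot decrement a zero counter")
        self.values[k] -= 1

    def prefix_is_zero(self, k):
        # The restricted zero-test: it inspects the whole prefix 0..k at once.
        return all(v == 0 for v in self.values[: k + 1])


def transfer(counters, src, dst):
    """Add the value of counter src to counter dst (assumes dst > src).

    Assumption: all counters before src are already zero, so the prefix
    test for src is equivalent to testing counter src alone.
    """
    while not counters.prefix_is_zero(src):
        counters.dec(src)
        counters.inc(dst)


# Counters are emptied in the order of the ordering, each into a later one.
counters = PrefixZeroTestCounters([3, 2, 0, 1])
transfer(counters, 0, 2)
transfer(counters, 1, 3)
```

Processing the counters in increasing order is what keeps the precondition of `transfer` true: when counter 1 is emptied, counter 0 has already been emptied.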



The definition of $F_p$ of $\mathcal{C}_p$ is similar to that of $F_a$ in Section 4.1.
Finally, the read head is moved to the next position.
This finishes the description of $\mathcal{C}_p$.

Finally, we consider the general case in which $\mathcal{L}(\mathcal{B})$ is a disjoint union of 0-priority regular
languages, i.e. $\Gamma$ is a disjoint union of $\Gamma_1, \ldots, \Gamma_k$ ($k \ge 1$) such that

• for each $w \in \Sigma^*$, $\mathcal{A}$ outputs a word in $\Gamma_1^* \cup \ldots \cup \Gamma_k^*$,

• $\mathcal{L}(\mathcal{B})$ is a union of languages $L_1, \ldots, L_k$ satisfying that $L_i \subseteq (\Gamma_i \times \{0,1\})^*$ is a 0-priority
regular language for each $i$.

For each $i$, let $\Gamma_i$ be ordered as $\gamma_{i,1} \ldots \gamma_{i,l_i}$, under which $L_i$ is a 0-priority regular language.

For each $i$, suppose $\mathcal{B}_i$ is a 0-priority finite automaton accepting $L_i$ and $Acyc_{i,j}$ ($1 \le j \le l_i + 1$) is the
set of 0-cyclic and $(\gamma_{i,j},0)$-acyclic states in $\mathcal{B}_i$.

Then from the PCA $\mathcal{D}$, a PMA $\mathcal{C}$ can be constructed such that the counters of $\mathcal{C}$ correspond to the
set of 0-cyclic states in all these $\mathcal{B}_i$'s, and these counters are ordered as follows,

$Acyc_{1,1} \, (Acyc_{1,2} \setminus Acyc_{1,1}) \cdots (Acyc_{1,l_1} \setminus Acyc_{1,l_1-1}) \, Acyc_{1,l_1+1} \; \ldots$
$Acyc_{k,1} \, (Acyc_{k,2} \setminus Acyc_{k,1}) \cdots (Acyc_{k,l_k} \setminus Acyc_{k,l_k-1}) \, Acyc_{k,l_k+1}$.

In the PCA $\mathcal{D}$, after the transducer $\mathcal{A}$ nondeterministically chooses an index $i$ and outputs a string
in $\Gamma_i^*$, only the 0-priority finite automaton $\mathcal{B}_i$ is used and the other automata $\mathcal{B}_j$ for $j \ne i$ remain
idle. Thus the values of the counters before $Acyc_{i,1}$ in the above ordering are always zero, and the
updates of the counter values corresponding to the states in
$Acyc_{i,1}, \ldots, Acyc_{i,l_i} \setminus Acyc_{i,l_i-1}, Acyc_{i,l_i+1}$ can still be fulfilled
using the restricted zero-tests of PMAs.



5 Application to the analysis of array-accessing programs 

In this section, we demonstrate how to apply class automata with priority class condition to the
algorithmic analysis of array-processing programs considered in [1]. The notation of this section
follows that of [1].

An array $A$ is a list $(A[1].s, A[1].d) \ldots (A[n].s, A[n].d)$ such that $A[i].s \in \Sigma$ and $A[i].d \in D$ for each
$i$: $1 \le i \le n$.

The syntax of array-accessing programs over an array A is defined by the following rules (the
nondeterministic-choice rule "if * then P else P" is not included here for simplicity),

P ::= skip | {P} | b := B | p := IE | v := DE |
      if B then P else P | for i := 1 to length(A) do P | P; P

where 

• $i, j, \ldots$ are loop variables, $p, p_1, \ldots$ are index variables, $v, v_1, \ldots$ are data variables, and
$b, b_1, \ldots$ are Boolean variables,

• $s, s_1, \ldots \in \Sigma$ and $c, c_1, \ldots \in D$ are constants,

• IE ::= p | i are index expressions, SE ::= s | A[IE].s are $\Sigma$-expressions, DE ::= v | c | A[IE].d are
data expressions, and B are Boolean expressions defined by the following rule,

B ::= true | false | b | B and B | not B | IE = IE | IE < IE | DE = DE | DE < DE | SE = SE.

A state of the array-accessing program P is an assignment of values to the variables in P. 
A Boolean state of the program P is an assignment of values to the Boolean variables in P. 
The initial state of the program P is a state such that 

• all the Boolean variables have value false; 

• all the loop and index variables have value 1 ; 

• all the data variables have the same value as the data value of the first element of A.

A loop-free program is a program containing no loops, namely a program formed without using the
rule "for i := 1 to length(A) do P".

The Boolean state reachability problem is defined as follows: Given a program P and a Boolean state
m of P, decide whether there is an array A such that m is reached from the initial state after the
execution of P over A.

Restricted ND2 programs are programs of the following form, 

for i:=1 to length(A) do
{
    P1;
    for j:=1 to length(A) do
    {
        if A[i].d=A[j].d then
            P2
        else
            P3
    };
    P4
}

such that 

• P1,P2,P3,P4 are loop-free,

• P1,P2,P3,P4 do not use index or data variables, 

• P1,P2,P3,P4 do not refer to the order on indices or data. 
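
To make the shape of these programs concrete, the following Python sketch executes a program of the above form over a given array. It is an illustration under stated assumptions, not part of the construction: the function `run_restricted_nd2` and its calling convention are hypothetical, an array is modelled as a list of (letter, data value) pairs, and P1, P2, P3, P4 are modelled as Python callables that only read and write a dictionary of Boolean variables.

```python
def run_restricted_nd2(array, p1, p2, p3, p4, booleans):
    """Run 'for i { P1; for j { if A[i].d = A[j].d then P2 else P3 }; P4 }'.

    Returns the Boolean state reached from the initial state, in which all
    Boolean variables are false.
    """
    state = {name: False for name in booleans}
    for i in range(len(array)):
        p1(state, array, i)
        for j in range(len(array)):
            if array[i][1] == array[j][1]:   # A[i].d = A[j].d
                p2(state, array, i, j)
            else:
                p3(state, array, i, j)
        p4(state, array, i)
    return state


def skip(state, array, *indices):
    """The program 'skip'."""


def mark_two_values(state, array, i, j):
    """A P3 that does not refer to A[j]: it only executes b := true."""
    state["b"] = True


one_value = [("a", 1), ("a", 1)]
two_values = [("a", 1), ("b", 2)]
print(run_restricted_nd2(one_value, skip, skip, mark_two_values, skip, ["b"]))
print(run_restricted_nd2(two_values, skip, skip, mark_two_values, skip, ["b"]))
```

In this toy instance the Boolean state b = true is reached exactly when the array contains two different data values. Running a program on one array does not decide reachability, which quantifies over all arrays.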

Theorem 24 ([1]). The Boolean state reachability problem is decidable for Restricted ND2 programs
satisfying the following additional condition:

P3 does not refer to A[j], i.e. it does not contain occurrences of A[j].s or A[j].d.
The idea of the proof of Theorem 24 is to reduce the Boolean state reachability problem to the
nonemptiness of extended data automata $\mathcal{D} = (\mathcal{A}, \mathcal{B})$ (c.f. Remark 14) such that

• $\mathcal{A}$ guesses an accepting run of the outer loop of P over an array A,

• $\mathcal{B}$ corresponds to the inner loop and verifies the consistency of the guessed run.

Roughly speaking, $\mathcal{B}$ can be constructed from P2 and P3 such that

• P2 corresponds to the one-transitions in $\mathcal{B}$,

• P3 corresponds to the zero-transitions in $\mathcal{B}$.



The restriction that P3 does not refer to A[j] in Theorem 24 is crucial, because in extended data
automata, the labels are omitted in zero-transitions of the class condition.

On the other hand, as we have shown, PCAs, i.e. class automata with priority class conditions, do
not omit the labels in zero-transitions and strictly generalize extended data automata. So naturally, by
using PCAs, we should be able to show that the Boolean state reachability problem is decidable for a
larger class of programs than those in Theorem 24.



Similar to the construction of extended data automata from Restricted-ND2 programs satisfying the
additional condition in Theorem 24, we have the following result.

Lemma 25. For a Restricted-ND2 program P and a Boolean state m, a class automaton $\mathcal{D} = (\mathcal{A}, \mathcal{B})$
can be constructed such that m is reached from the initial state after the run of P over an array A iff the
array (data word) A is accepted by $\mathcal{D}$.

In principle, the Boolean state reachability problem is decidable for Restricted-ND2 programs P satisfying
the additional condition that the class automaton $\mathcal{D} = (\mathcal{A}, \mathcal{B})$ constructed from P in Lemma 25 is a class
automaton with priority class condition. However, this condition is in some sense a semantic condition,
since the construction of the automaton $\mathcal{D}$ from P has an exponential blow-up. In the following, we
demonstrate how to define a simple syntactic condition for P3 which guarantees that $\mathcal{D}$ constructed from
P is a PCA.

A 0-priority restricted-ND2 program is a Restricted-ND2 program satisfying the following condition:

Either P3 does not refer to A[j], i.e. it does not contain occurrences of A[j].s or A[j].d,
or there is a set of constants $s_1, \ldots, s_r \in \Sigma$ such that P3 is a program of the following form,

if BB then
    if A[j].s=s1 then
        PA1
    else if A[j].s=s2 then
        PA2
    ...
    else if A[j].s=sr then
        PAr
    else skip
else skip

such that 

• BB is a conjunction of literals, i.e. b or not b for Boolean variables b, 

• PA\,PA2,. . . ,PAr are compositions of the assignments b := true or b := false for 
Boolean variables b. 
• Each PAi for $1 \le i \le r$ is nontrivial in the sense that there is a Boolean variable b such
that either b is a conjunct of BB and the assignment b := false occurs in PAi, or not b is a
conjunct of BB and the assignment b := true occurs in PAi.
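
The nontriviality requirement is purely syntactic and can be checked directly. The following Python sketch is an illustration with a hypothetical encoding: BB is a dictionary mapping each Boolean variable occurring in it to the polarity of its literal (True for b, False for not b), and each PAi is a list of assignments (variable, value).

```python
def is_nontrivial(bb, pa):
    """True iff some assignment in pa falsifies a literal of bb.

    That is, b is a conjunct of bb and 'b := false' occurs in pa, or
    'not b' is a conjunct of bb and 'b := true' occurs in pa.
    """
    return any(var in bb and bb[var] != value for var, value in pa)


def satisfies_condition(bb, branches):
    """Check that every branch PA1, ..., PAr is nontrivial with respect to bb."""
    return all(is_nontrivial(bb, pa) for pa in branches)


# BB = not b1 and b2 and not b3, with a single branch PA1 = (b2 := false; b1 := true).
bb = {"b1": False, "b2": True, "b3": False}
branches = [[("b2", False), ("b1", True)]]
print(satisfies_condition(bb, branches))
```

Intuitively, a nontrivial PAi makes BB false after it is executed, so the guarded block cannot fire again without an intervening change of the Boolean state.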

Remark 26. The 0-priority restricted-ND2 programs subsume the Restricted-ND2 programs in which
P3 does not refer to A[j]. A slightly more general syntactic condition than the above can be defined,
which we choose not to present here, since the condition is rather tedious, and we believe that the simple
condition presented above already sheds some light on the usefulness of PCAs.

Example 27. The following program, which describes the property "for any two occurrences of the letter a
with the same data value in A, there is an occurrence of the letter b between them with a different data
value" (c.f. Example 13), is an example of a 0-priority restricted-ND2 program. Intuitively,

• the Boolean state b1 = true, b2 = false, b3 = false corresponds to the state $q_0$ of the automaton in
part (a) of the corresponding figure, the Boolean state b1 = false, b2 = true, b3 = false corresponds
to the state $q_1$, and the Boolean state b1 = false, b2 = false, b3 = true corresponds to the state $q_2$;

• the outer loop selects a position i and the inner loop verifies that the class string corresponding to
the data value A[i].d satisfies the class condition.

for i:=1 to length(A) do
{
    if not b3 then    % the sink state q2 is not reached yet
        b1:=true; b2:=false
    else
        skip
    for j:=1 to length(A) do
    {   if A[i].d = A[j].d then
        {   if A[j].s=a then
                if b1 and not b2 and not b3 then
                    b1:=false; b2:=true
                else if not b1 and b2 and not b3 then
                    b2:=false; b3:=true
                else skip
            else skip
        }
        else
        {   if not b1 and b2 and not b3 then
                if A[j].s = b then
                    b2:=false; b1:=true
                else skip
            else skip
        }
    }
}

An array A satisfies the property iff the Boolean state b1 = true, b2 = false, b3 = false or the state
b1 = false, b2 = true, b3 = false is reached from the initial state after the run of the above program
over the array A.
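
For readers who wish to experiment, the program of Example 27 can be transcribed into Python as follows. This is a direct transcription for testing on concrete arrays only; the function names are hypothetical, an array is modelled as a list of (letter, data value) pairs, and nothing here implements the decision procedure.

```python
def run_example_27(array):
    """Execute the program of Example 27 and return the final (b1, b2, b3)."""
    b1 = b2 = b3 = False                      # initial Boolean state
    for (_, d_i) in array:                    # outer loop over positions i
        if not b3:                            # the sink state q2 is not reached yet
            b1, b2 = True, False
        for (s_j, d_j) in array:              # inner loop over positions j
            if d_i == d_j:
                if s_j == "a":
                    if b1 and not b2 and not b3:
                        b1, b2 = False, True  # q0 -> q1
                    elif not b1 and b2 and not b3:
                        b2, b3 = False, True  # q1 -> q2 (sink)
            else:
                if not b1 and b2 and not b3:
                    if s_j == "b":
                        b2, b1 = False, True  # q1 -> q0
    return (b1, b2, b3)


def satisfies_property(array):
    """True iff the final Boolean state encodes q0 or q1."""
    return run_example_27(array) in {(True, False, False), (False, True, False)}


print(satisfies_property([("a", 1), ("b", 2), ("a", 1)]))
print(satisfies_property([("a", 1), ("b", 1), ("a", 1)]))
```

In the first array the two occurrences of a with data value 1 are separated by a b with a different data value. In the second array the separating b carries the same data value, so the sink state is reached.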






Theorem 28. The Boolean state reachability problem is decidable for 0-priority restricted-ND2
programs.